Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods
Authors
Abstract
The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present-day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm, based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a 2-dimensional model of an Octopus arm. Even with the simplifications inherent to this model, the state space we face is high-dimensional. We apply a GPTD-based algorithm to this domain and demonstrate its operation on several learning tasks of varying degrees of difficulty.
Similar resources
A New Type-2 Fuzzy Systems for Flexible-Joint Robot Arm Control
In this paper, an adaptive neuro-fuzzy inference system is presented, based on interval Gaussian type-2 fuzzy sets in the antecedent part and Gaussian type-1 fuzzy sets as coefficients of the linear combination of input variables in the consequent part. The capability of the proposed method (which we name ANFIS2) for function approximation and dynamical system identification is shown. The ANFIS2 structure ...
Kinematic decomposition and classification of octopus arm movements
The octopus arm is a muscular hydrostat and due to its deformable and highly flexible structure it is capable of a rich repertoire of motor behaviors. Its motor control system uses planning principles and control strategies unique to muscular hydrostats. We previously reconstructed a data set of octopus arm movements from records of natural movements using a sequence of 3D curves describing the...
A Flexible Link Radar Control Based on Type-2 Fuzzy Systems
An adaptive neuro fuzzy inference system based on interval Gaussian type-2 fuzzy sets in the antecedent part and Gaussian type-1 fuzzy sets as coefficients of linear combination of input variables in the consequent part is presented in this paper. The capability of the proposed method (we named ANFIS2) for function approximation and dynamical system identification is remarkable. The structure o...
Control of Multivariable Systems Based on Emotional Temporal Difference Learning Controller
One of the most important issues that we face in controlling delayed systems and non-minimum phase systems is to fulfill objective orientations simultaneously and in the best way possible. This paper proposes a new method in which an objective orientation is presented for controlling multi-objective systems. The principles of this method are based on emotional temporal difference learning, and has a...
How to Harness the Dynamics of Soft Body: Timing Based Control of a Simulated Octopus Arm via Recurrent Neural Networks
References [1] D. Trivedi, C. D. Rahn, W. M. Kier and I. D. Walker, “Soft robotics: Biological inspiration, state of the art, and future research,” Applied Bionics and Biomechanics, vol. 5 (3), 2008, pp 99-117. [2] G. Sumbre, Y. Gutfreund, G. Fiorito, T. Flash and B. Hochner, “Control of Octopus Arm Extension by a Peripheral Motor Program,” Science, vol. 293, 2001, pp 1845–1848. [3] Y. Yekutiel...